AppleTree File Formats

How to set up files defining your model and data for use in AppleTree

Before you can fit your multinomial binary tree (MBT) models to data, you have to create two files: a model file defining your MBT model and a data file specifying the data you want to fit the model to. Both are ASCII text files, so they can be written with any text editor. If you have already created ".eqn" and ".mdt" files for use in Xiangen Hu's MBT program for MS-DOS, you can use them in AppleTree as well. (Some restrictions apply to model files; see below.)

Model Files

The first line of the model description file contains the total number of branches specified in the model. Beginning with the second line, descriptions of the tree branches follow, one line for each branch. The branch descriptions do not have to follow any particular order.

Every line has to begin with the name of the tree the branch belongs to. Like category names, tree names can consist of any sequence of alphanumeric characters.

The name of the category to which the branch belongs follows, separated from the tree name by one or more blanks.

After the category name and one or more blanks comes the expression that determines the branch probability. The branch probability is a product of parameters and/or their complements, written as (1-p) for a parameter p, in the sequence in which they appear along the tree branch, starting from the root. Parameter names follow the same rules as category and tree names.

Note that AppleTree is more restrictive than Hu's MBT program in two respects with regard to the expressions that determine branch probabilities. First, the order of the parameters in an expression matters: AppleTree does not treat the product as commutative. This restriction was added in version 1.1 to make error messages for inconsistent tree definitions more helpful; AppleTree can now report the line number of the first expression that does not fit into the tree defined so far. If you want to use model files in which the order of parameters in the expressions does not match the order of parameters in the tree branches (starting from the root), you can use AppleTree version 1.0.3, which is still available. Second, certain expressions are MBT models by definition but cannot be represented in a binary tree without transformations. Expressions of this type cannot be used in AppleTree.
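The ordering rule can be illustrated with a small Python sketch. It is not AppleTree's actual algorithm, only one plausible way to find the first branch line that does not fit into the tree built so far. The function names, the dictionary-based tree, and the restriction to factors of the form p or (1-p) are assumptions made for this illustration.

```python
import re

# One factor is either "name" or "(1-name)" (assumed syntax for this sketch).
FACTOR = re.compile(r"^(?:\(\s*1\s*-\s*(\w+)\s*\)|(\w+))$")


def parse_factors(expression):
    """Split 'pAB*(1-pBC)' into [('pAB', True), ('pBC', False)]."""
    factors = []
    for token in expression.split("*"):
        match = FACTOR.match(token.strip())
        if match is None:
            raise ValueError(f"unsupported factor: {token!r}")
        complement, plain = match.groups()
        factors.append((plain, True) if plain else (complement, False))
    return factors


def new_node():
    return {"param": None, "children": {}, "category": None}


def first_inconsistent_line(branch_lines, first_line_number=2):
    """Return the file line number of the first branch that does not fit
    into the tree built so far, or None if all branches fit.
    Branch lines start at line 2 because line 1 holds the branch count."""
    roots = {}  # one root node per tree name
    for offset, line in enumerate(branch_lines):
        tree, category, expression = line.split(None, 2)
        node = roots.setdefault(tree, new_node())
        fits = True
        for param, positive in parse_factors(expression):
            if node["category"] is not None:
                fits = False  # path would continue below an existing leaf
                break
            if node["param"] is None:
                node["param"] = param  # first branch to pass through this node
            elif node["param"] != param:
                fits = False  # a different parameter already sits at this node
                break
            node = node["children"].setdefault(positive, new_node())
        # The branch must end on a fresh node: not an inner node, not a used leaf.
        if not fits or node["param"] is not None or node["category"] is not None:
            return first_line_number + offset
        node["category"] = category
    return None


ordered = [
    "Pairs Fcc pAB*pBC",
    "Pairs Fcc pAB*(1-pBC)*pAC",
    "Pairs Fcf pAB*(1-pBC)*(1-pAC)",
]
# Same model, but the second branch lists its factors out of root-to-leaf order.
reordered = [ordered[0], "Pairs Fcc (1-pBC)*pAB*pAC", ordered[2]]

ordered_result = first_inconsistent_line(ordered)
reordered_result = first_inconsistent_line(reordered)
print(ordered_result, reordered_result)
```

In this sketch the correctly ordered branches yield None, while the reordered variant is flagged at line 3, because its first factor names pBC where the root node already holds pAB.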

The model in the example outlined in the introduction consists of one tree (called "Pairs") with 5 branches. Two of the branches end in the same category, Fcc. The model file for the example is:
 

5
Pairs Fcc pAB*pBC
Pairs Fcc pAB*(1-pBC)*pAC
Pairs Fcf pAB*(1-pBC)*(1-pAC)
Pairs Ffc (1-pAB)*pAC
Pairs Fff (1-pAB)*(1-pAC)
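To make the file layout concrete, the following Python sketch reads the example model file, checks the branch count given in the first line, and sums the branch probabilities for each category. The helper names and the parameter values (0.5 each) are hypothetical and chosen only for illustration; the sketch assumes every factor is either a parameter name or has the form (1-name).

```python
from collections import defaultdict

MODEL_TEXT = """\
5
Pairs Fcc pAB*pBC
Pairs Fcc pAB*(1-pBC)*pAC
Pairs Fcf pAB*(1-pBC)*(1-pAC)
Pairs Ffc (1-pAB)*pAC
Pairs Fff (1-pAB)*(1-pAC)
"""


def read_model(text):
    """Return a list of (tree, category, expression) tuples."""
    lines = [line for line in text.splitlines() if line.strip()]
    declared = int(lines[0])  # first line: total number of branches
    branches = [tuple(line.split(None, 2)) for line in lines[1:]]
    if len(branches) != declared:
        raise ValueError(f"expected {declared} branches, found {len(branches)}")
    return branches


def branch_probability(expression, params):
    """Multiply the factors of one branch expression."""
    prob = 1.0
    for factor in expression.replace(" ", "").split("*"):
        if factor.startswith("(1-") and factor.endswith(")"):
            prob *= 1.0 - params[factor[3:-1]]  # complement of a parameter
        else:
            prob *= params[factor]
    return prob


def category_probabilities(branches, params):
    """Sum the branch probabilities that end in the same category."""
    totals = defaultdict(float)
    for tree, category, expression in branches:
        totals[(tree, category)] += branch_probability(expression, params)
    return dict(totals)


branches = read_model(MODEL_TEXT)
# Hypothetical parameter values, not estimates from any data set.
params = {"pAB": 0.5, "pBC": 0.5, "pAC": 0.5}
probs = category_probabilities(branches, params)
for (tree, category), value in sorted(probs.items()):
    print(tree, category, value)
```

Because both Fcc branches are added together, the category probabilities within the tree sum to 1 for any parameter values between 0 and 1.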

The easier way

You can also create model files with the built-in graphical model editor. To do this, select "New Model in Graph Editor" in the "Model" menu. This opens an empty tree graph window with three pop-up menus at the top. To build a new model, first select "New Tree..." from the "Trees" menu and enter a name for the tree. The tree name appears in the window together with a circle symbolizing the root node. The circle is highlighted, indicating that it is currently selected. You can select nodes by clicking on them. Next, select "New Category" from the "Categories" menu and enter a name for the category. As soon as a category is specified, bifurcations can be added to the tree by selecting parameters in the "Parameters" pop-up menu.

Data Files

The first line of the data file must be a title for the data that follow. Each subsequent line holds the data for one response category. Each line begins with a unique name for the category. Category names may consist of any sequence of alphanumeric characters. The empirical frequency for the category follows the name, separated from it by any sequence of blanks (tabs or spaces). Categories can be entered in any order. Immediately after the last line of data there must be a line beginning with three consecutive '=' signs. Additional data sets can be appended after this line in the same format as the first one.

In the example given in the introduction, the frequency distribution of the answers may comprise 25 items in which B and C are reproduced correctly (category Fcc), 35 items in which only B is reproduced correctly (category Fcf), 40 items in which only C is reproduced correctly (category Ffc), and 100 items in which both answers are false (category Fff). The data file for this example looks like this:

Hypothetical response frequencies after a serial learning task
Fcc 25
Fcf 35
Ffc 40
Fff 100
=======================
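The data file layout can be read back with a few lines of Python. The sketch below is an illustration under stated assumptions, not AppleTree's own reader: the function name is hypothetical, blank lines are skipped, and frequencies are parsed as floating-point numbers so that non-integer values would also be accepted.

```python
DATA_TEXT = """\
Hypothetical response frequencies after a serial learning task
Fcc 25
Fcf 35
Ffc 40
Fff 100
=======================
"""


def read_data_sets(text):
    """Return a list of (title, {category: frequency}) pairs."""
    data_sets = []
    title, frequencies = None, {}
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines are ignored
        if line.startswith("==="):
            if title is None:
                raise ValueError("separator line without a preceding data set")
            data_sets.append((title, frequencies))
            title, frequencies = None, {}  # a further data set may follow
        elif title is None:
            title = line.strip()  # first line of a data set is its title
        else:
            name, count = line.split()  # any run of tabs or spaces separates them
            if name in frequencies:
                raise ValueError(f"duplicate category name: {name}")
            frequencies[name] = float(count)
    if title is not None:
        raise ValueError("data set is not terminated by a '===' line")
    return data_sets


data_sets = read_data_sets(DATA_TEXT)
for title, frequencies in data_sets:
    print(title)
    print(frequencies, "total:", sum(frequencies.values()))
```

For the example file this yields a single data set with four categories and a total of 200 observations.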

The easier way

You can also generate data files from models you have already defined. To do this, first enter a model into AppleTree, then select "Generate Expected Data Set" from the "Data" menu. A new window opens with a data set corresponding to the active model. You can then simply replace the category frequencies with your own data and save the document.


